AAAI.2022 - Constraint Satisfaction and Optimization

Total: 25

#1 Undercover Boolean Matrix Factorization with MaxSAT

Authors: Florent Avellaneda ; Roger Villemaire

The k-undercover Boolean matrix factorization problem aims to approximate an m×n Boolean matrix X as the Boolean product A◦B of an m×k matrix A and a k×n matrix B such that X covers A◦B, i.e., no representation error is allowed on the 0-entries of X. To infer optimal and “block-optimal” k-undercovers, we propose two exact methods based on MaxSAT encodings. From a theoretical standpoint, we prove that our method for inferring “block-optimal” k-undercovers is a (1 - 1/e) ≈ 0.632-approximation for the optimal k-undercover problem. From a practical standpoint, experimental results indicate that our “block-optimal” k-undercover algorithm outperforms the state of the art, even when compared with algorithms for the more general Boolean matrix factorization problem, for which only minimizing the reconstruction error is required.
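
To make the cover condition concrete, here is a minimal sketch (not the authors' MaxSAT encoding; the toy matrices are illustrative) of the Boolean product and the undercover test it must satisfy:

```python
import numpy as np

def boolean_product(A, B):
    # Boolean product: (A ◦ B)[i, j] = 1 iff A[i, t] = B[t, j] = 1 for some t.
    return (A @ B > 0).astype(int)

def is_undercover(X, A, B):
    # X covers A ◦ B: the product may miss 1-entries of X,
    # but must never place a 1 where X has a 0.
    return bool(np.all(boolean_product(A, B) <= X))

X = np.array([[1, 1, 0], [1, 0, 0]])
A = np.array([[1], [1]])        # m×k with k = 1
B = np.array([[1, 0, 0]])       # k×n
assert is_undercover(X, A, B)   # errors only on 1-entries of X
```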

#2 Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach

Authors: Qinbo Bai ; Amrit Singh Bedi ; Mridul Agarwal ; Alec Koppel ; Vaneet Aggarwal

Reinforcement learning is widely used in applications where one needs to make sequential decisions while interacting with the environment. The problem becomes more challenging when the decisions must also satisfy safety constraints. Such a problem is mathematically formulated as a constrained Markov decision process (CMDP). In the literature, various algorithms are available to solve CMDP problems in a model-free manner and achieve an epsilon-optimal cumulative reward with epsilon-feasible policies. An epsilon-feasible policy, however, may still violate the constraints. An important question is therefore whether we can achieve an epsilon-optimal cumulative reward with zero constraint violation. To this end, we advocate a randomized primal-dual approach for solving CMDPs and propose a conservative stochastic primal-dual algorithm (CSPDA), which is shown to exhibit O(1/epsilon^2) sample complexity for achieving an epsilon-optimal cumulative reward with zero constraint violation. In prior work, the best available sample complexity for an epsilon-optimal policy with zero constraint violation is O(1/epsilon^5). Hence, the proposed algorithm provides a significant improvement over the state of the art.
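
For reference, one standard way to write the CMDP a primal-dual method targets (notation ours, illustrative, with V_r and V_c the two discounted expectations):

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t,a_t)\Big]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\Big[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t,a_t)\Big] \le b,
\qquad
\mathcal{L}(\pi,\lambda) \;=\; V_r(\pi) \;-\; \lambda\,\big(V_c(\pi) - b\big),\quad \lambda \ge 0 .
```

Primal-dual algorithms ascend in the policy and descend in the multiplier λ on this Lagrangian; the conservative variant tightens b to force the constraint violation to zero.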

#3 GEQCA: Generic Qualitative Constraint Acquisition

Authors: Mohamed-Bachir Belaid ; Nassim Belmecheri ; Arnaud Gotlieb ; Nadjib Lazaar ; Helge Spieker

Many planning, scheduling or multi-dimensional packing problems involve the design of subtle logical combinations of temporal or spatial constraints. On the one hand, precisely modelling these constraints, which are formulated in various relation algebras, entails a large number of possible logical combinations and requires expertise in constraint-based modelling. On the other hand, active constraint acquisition (CA) has been used successfully to support inexperienced users in learning conjunctive constraint networks through the generation of a sequence of queries. In this paper, we propose GEQCA, which stands for Generic Qualitative Constraint Acquisition, an active CA method that learns qualitative constraints via the concept of qualitative queries. GEQCA combines qualitative queries with time-bounded path consistency (PC) and background knowledge propagation to acquire the qualitative constraints of any scheduling or packing problem. We prove soundness, completeness and termination of GEQCA by exploiting the jointly exhaustive and pairwise disjoint property of qualitative calculus, and we give an experimental evaluation that shows (i) the efficiency of our approach in learning temporal constraints and (ii) the use of GEQCA on real scheduling instances.

#4 Certified Symmetry and Dominance Breaking for Combinatorial Optimisation

Authors: Bart Bogaerts ; Stephan Gocht ; Ciaran McCreesh ; Jakob Nordström

Symmetry and dominance breaking can be crucial for solving hard combinatorial search and optimisation problems, but the correctness of these techniques sometimes relies on subtle arguments. For this reason, it is desirable to produce efficient, machine-verifiable certificates that solutions have been computed correctly. Building on the cutting planes proof system, we develop a certification method for optimisation problems in which symmetry and dominance breaking are easily expressible. Our experimental evaluation demonstrates that we can efficiently verify fully general symmetry breaking in Boolean satisfiability (SAT) solving, thus providing, for the first time, a unified method to certify a range of advanced SAT techniques that also includes XOR and cardinality reasoning. In addition, we apply our method to maximum clique solving and constraint programming as a proof of concept that the approach applies to a wider range of combinatorial problems.

#5 The Perils of Learning Before Optimizing

Authors: Chris Cameron ; Jason Hartford ; Taylor Lundy ; Kevin Leyton-Brown

Formulating real-world optimization problems often begins with making predictions from historical data (e.g., an optimizer that aims to recommend fast routes relies upon travel-time predictions). Typically, learning the prediction model used to generate the optimization problem and solving that problem are performed in two separate stages. Recent work has shown how such prediction models can be learned end-to-end by differentiating through the optimization task. Such methods often yield empirical improvements, which are typically attributed to end-to-end learning making better error tradeoffs than the standard loss function used in a two-stage solution. We refine this explanation and more precisely characterize when end-to-end learning can improve performance. When prediction targets are stochastic, a two-stage solution must make an a priori choice about which statistics of the target distribution to model (we consider expectations over prediction targets), while an end-to-end solution can make this choice adaptively. We show that the performance gap between a two-stage and an end-to-end approach is closely related to the price of correlation (POC) concept in stochastic optimization, and we show the implications of some existing POC results for the predict-then-optimize problem. We then consider a novel and particularly practical setting, where multiple prediction targets are combined to obtain each of the objective function’s coefficients. We give explicit constructions where (1) two-stage performs unboundedly worse than end-to-end, and (2) two-stage is optimal. We use simulations to experimentally quantify performance gaps and identify a wide range of real-world applications from the literature whose objective functions rely on multiple prediction targets, suggesting that end-to-end learning could yield significant improvements.
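
A compact way to state the two pipelines being compared (notation ours, a sketch of the standard predict-then-optimize setup, not taken verbatim from the paper):

```latex
% Two-stage: fit a fixed statistic of y (here its mean), then optimize.
\hat{\theta} \in \arg\min_{\theta}\; \mathbb{E}\big\| y - f_{\theta}(x) \big\|^{2},
\qquad
z^{\mathrm{2s}}(x) \in \arg\min_{z \in Z}\; c\big(z,\, f_{\hat{\theta}}(x)\big);

% End-to-end: choose the predictor by the downstream decision cost itself.
\theta^{\mathrm{e2e}} \in \arg\min_{\theta}\; \mathbb{E}\Big[\, c\big(z^{\star}(f_{\theta}(x)),\, y\big) \Big],
\qquad
z^{\star}(\hat{y}) \in \arg\min_{z \in Z}\; c(z, \hat{y}).
```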

#6 A Lyapunov-Based Methodology for Constrained Optimization with Bandit Feedback

Authors: Semih Cayci ; Yilin Zheng ; Atilla Eryilmaz

In a wide variety of applications, including online advertising, contractual hiring, and wireless scheduling, the controller is subject to a stringent budget constraint on the available resources, which are consumed in a random amount by each action, and to a stochastic feasibility constraint that may impose important operational limitations on decision-making. In this work, we consider a general model to address such problems, where each action returns a random reward, cost, and penalty from an unknown joint distribution, and the decision-maker aims to maximize the total reward under a budget constraint B on the total cost and a stochastic constraint on the time-average penalty. We propose a novel low-complexity algorithm based on the Lyapunov optimization methodology, named LyOn, and prove that for K arms it achieves O(√(KB log B)) regret and zero constraint violation when B is sufficiently large. The low computational cost and sharp performance bounds of LyOn suggest that the Lyapunov-based algorithm design methodology can be effective in solving constrained bandit optimization problems.

#7 Resolving Inconsistencies in Simple Temporal Problems: A Parameterized Approach

Authors: Konrad K. Dabrowski ; Peter Jonsson ; Sebastian Ordyniak ; George Osipov

The simple temporal problem (STP) is one of the most influential reasoning formalisms for representing temporal information in AI. We study the problem of resolving inconsistency of data encoded in the STP. We prove that the problem of identifying a maximally large consistent subset of data is NP-hard. In practical instances, it is reasonable to assume that the amount of erroneous data is small. We therefore parameterize by the number of constraints that need to be removed to achieve consistency. Using tools from parameterized complexity we design fixed-parameter tractable algorithms for two large fragments of the STP. Our main algorithmic results employ reductions to the Directed Subset Feedback Arc Set problem and iterative compression combined with an efficient algorithm for the Edge Multicut problem. We complement our algorithmic results with hardness results that rule out fixed-parameter tractable algorithms for all remaining non-trivial fragments of the STP (under standard complexity-theoretic assumptions). Together, our results give a full classification of the classical and parameterized complexity of the problem.
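
As background (not the paper's parameterized algorithms), consistency of an STP reduces to negative-cycle detection in its distance graph; the tuple representation (i, j, l, u) for l ≤ x_j − x_i ≤ u is our assumption:

```python
def stp_consistent(n, constraints):
    """Standard STP consistency test via Bellman-Ford.

    Each constraint l <= x_j - x_i <= u yields two edges in the
    distance graph: i -> j with weight u and j -> i with weight -l.
    The STP is consistent iff this graph has no negative cycle.
    """
    edges = []
    for i, j, l, u in constraints:
        edges.append((i, j, u))
        edges.append((j, i, -l))
    dist = [0] * n  # implicit source with 0-weight edges to all nodes
    for _ in range(n - 1):  # Bellman-Ford relaxation rounds
        for a, b, w in edges:
            if dist[a] + w < dist[b]:
                dist[b] = dist[a] + w
    # Any further improving edge certifies a negative cycle (inconsistency).
    return all(dist[a] + w >= dist[b] for a, b, w in edges)
```

Resolving inconsistency then amounts to deleting constraint edges until no negative cycle remains, which is where the feedback-arc-set machinery in the paper enters.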

#8 Efficient Riemannian Meta-Optimization by Implicit Differentiation

Authors: Xiaomeng Fan ; Yuwei Wu ; Zhi Gao ; Yunde Jia ; Mehrtash Harandi

To solve optimization problems with nonlinear constraints, the recently developed Riemannian meta-optimization methods show promise: they train neural networks as optimizers to perform optimization on Riemannian manifolds. A key challenge is the heavy computational and memory burden, because computing the meta-gradient with respect to the optimizer involves a series of time-consuming derivatives and stores large computation graphs in memory. In this paper, we propose an efficient Riemannian meta-optimization method that decouples the complex computation scheme from the meta-gradient. We derive a Riemannian implicit differentiation to compute the meta-gradient by establishing a link between Riemannian optimization and the implicit function theorem. As a result, updating our optimizer depends only on the final two iterations, which in turn speeds up our method and significantly reduces the memory footprint. We theoretically study the computational load and memory footprint of our method for long optimization trajectories, and we conduct an empirical study to demonstrate the benefits of the proposed method. Evaluations of three optimization problems on different Riemannian manifolds show that our method achieves state-of-the-art performance in terms of convergence speed and quality of the optima.

#9 Faster Algorithms for Weak Backdoors

Authors: Serge Gaspers ; Andrew Kaploun

A weak backdoor, or simply a backdoor, for a Boolean SAT formula F into a class of SAT formulae C is a partial truth assignment T such that F[T] is in C and satisfiability is preserved. The problem of finding a backdoor from class C1 into class C2, or WB(C1,C2), can be stated as follows: given a formula F in C1 and a natural number k, determine whether there exists a backdoor for F into C2 assigning at most k variables. The class 0-Val contains all Boolean formulae with at least one negative literal in each clause. We design a new algorithm for WB(3CNF, 0-Val) by reducing it to a local search variant of 3-SAT. We show that our algorithm runs in time O*(2.562^k), improving on the previous state of the art of O*(2.85^k). Here, the O* notation is a variant of big-O notation that suppresses polynomial factors in the input size. Next, we look at WB(3CNF, Null), where Null is the class consisting of the empty formula. This problem was known to have a trivial running-time upper bound of O*(6^k) and can easily be solved in O*(3^k) time. We use a reduction to Conflict-Free-d-Hitting-Set to prove an upper bound of O*(2.2738^k), and we also rule out 2^o(k)-time algorithms assuming the Exponential Time Hypothesis. Finally, Horn is the class of formulae with at most one positive literal per clause. We improve the previous O*(4.54^k) running time for the WB(3CNF, Horn) problem to O*(4.17^k) by exploiting the structure of the SAT instance to give a novel proof of the non-existence of the slowest cases after a slight restructuring of the branching priorities.
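
To make the definition concrete, a small sketch of applying a partial assignment and testing membership in 0-Val (clauses as lists of signed integers, T as a dict from variable to bool, representation ours). Since every 0-Val formula is satisfied by the all-false assignment, membership alone certifies the weak backdoor here:

```python
def apply_assignment(clauses, T):
    # F[T]: drop clauses satisfied by T, delete falsified literals.
    result = []
    for clause in clauses:
        reduced, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var in T:
                if T[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                reduced.append(lit)
        if not satisfied:
            result.append(reduced)
    return result

def is_backdoor_to_zero_val(clauses, T):
    # 0-Val: every remaining clause keeps at least one negative literal,
    # so setting all unassigned variables to false satisfies F[T].
    return all(any(lit < 0 for lit in c)
               for c in apply_assignment(clauses, T))
```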

#10 A Divide and Conquer Algorithm for Predict+Optimize with Non-convex Problems

Authors: Ali Ugur Guler ; Emir Demirović ; Jeffrey Chan ; James Bailey ; Christopher Leckie ; Peter J. Stuckey

The predict+optimize problem combines machine learning and combinatorial optimization by first predicting the problem coefficients and then using these coefficients to solve the optimization problem. While this problem can be solved in two separate stages, recent research shows that end-to-end models can achieve better results. This requires differentiating through a discrete combinatorial function. Models that use differentiable surrogates are prone to approximation errors, while existing exact models are limited to dynamic programming or do not generalize well with scarce data. In this work, we propose a novel divide-and-conquer algorithm based on transition points to reason over exact optimization problems and predict the coefficients using the optimization loss. Moreover, our model is not limited to dynamic programming problems. We also introduce a greedy version, which achieves similar results with less computation. In comparison with other predict+optimize frameworks, we show that our method outperforms existing exact frameworks and can reason over hard combinatorial problems better than surrogate methods.

#11 Computing Diverse Shortest Paths Efficiently: A Theoretical and Experimental Study

Authors: Tesshu Hanaka ; Yasuaki Kobayashi ; Kazuhiro Kurita ; See Woo Lee ; Yota Otachi

Finding diverse solutions to combinatorial problems has recently received considerable attention (Baste et al. 2020; Fomin et al. 2020; Hanaka et al. 2021). In this paper, we study problems of the following type: given an integer k, the problem asks for k solutions such that the sum of pairwise (weighted) Hamming distances between these solutions is maximized. Such solutions are called diverse solutions. We present a polynomial-time algorithm for finding diverse shortest st-paths in weighted directed graphs. Moreover, we study the diverse versions of other classical combinatorial problems, such as diverse weighted matroid bases, diverse weighted arborescences, and diverse bipartite matchings, and show that these problems can also be solved in polynomial time. To evaluate the practical performance of our algorithm for finding diverse shortest st-paths, we conduct a computational experiment with synthetic and real-world instances. The experiment shows that our algorithm successfully computes diverse solutions within reasonable computational time.
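
The diversity objective itself is simple to state in code (solutions as sets of elements, e.g., the edges of a path; the optional weight dict is our representation):

```python
from itertools import combinations

def diversity(solutions, weight=None):
    # Sum of pairwise (weighted) Hamming distances between solutions.
    total = 0
    for s, t in combinations(solutions, 2):
        for e in s.symmetric_difference(t):  # elements in exactly one of s, t
            total += weight[e] if weight else 1
    return total
```

The algorithmic content of the paper lies in finding k shortest st-paths maximizing this quantity in polynomial time, not in evaluating it.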

#12 Optimizing Binary Decision Diagrams with MaxSAT for Classification

Authors: Hao Hu ; Marie-José Huguet ; Mohamed Siala

The growing interest in explainable artificial intelligence (XAI) for critical decision making motivates the need for interpretable machine learning (ML) models. In fact, due to their structure (especially at small sizes), these models are inherently understandable by humans. Recently, several exact methods for computing such models have been proposed to overcome the weaknesses of traditional heuristic methods by providing more compact models or better prediction quality. Despite their compact representation of Boolean functions, binary decision diagrams (BDDs) have not attracted as much interest as other interpretable ML models. In this paper, we first propose SAT-based models for learning optimal BDDs (in terms of the number of features) that classify all input examples. We then lift the encoding to a MaxSAT model to learn optimal BDDs of limited depth that maximize the number of correctly classified examples. Finally, we tackle the fragmentation problem by introducing a method that merges compatible subtrees of the BDDs found via the MaxSAT model. Our empirical study shows clear benefits of the proposed approach in terms of prediction quality and interpretability (i.e., smaller size) compared to the state-of-the-art approaches.

#13 Using MaxSAT for Efficient Explanations of Tree Ensembles

Authors: Alexey Ignatiev ; Yacine Izza ; Peter J. Stuckey ; Joao Marques-Silva

Tree ensembles (TEs) are a prevalent machine learning model that offers no guarantees of interpretability and thus represents a challenge from the perspective of explainable artificial intelligence. Besides model-agnostic approaches, recent work proposed to explain TEs with formally defined explanations, computed with oracles for propositional satisfiability (SAT) and satisfiability modulo theories. Computing explanations for TEs involves linear constraints to express the prediction, which in practice deteriorates the scalability of the underlying reasoners. Motivated by the inherently propositional nature of TEs, this paper proposes to circumvent the need for linear constraints and instead employ an optimization engine for pure propositional logic to handle the prediction efficiently. Concretely, the paper proposes to use a MaxSAT solver and exploit the objective function to determine a winning class. This is achieved by devising a propositional encoding for computing explanations of TEs. Furthermore, the paper proposes additional heuristics to improve the underlying MaxSAT solving procedure. Experimental results obtained on a wide range of publicly available datasets demonstrate that the proposed MaxSAT-based approach is on par with or outperforms the existing reasoning-based explainers, thus representing a robust and efficient alternative for computing formal explanations for TEs.

#14 Finding Backdoors to Integer Programs: A Monte Carlo Tree Search Framework

Authors: Elias B. Khalil ; Pashootan Vaezipoor ; Bistra Dilkina

In Mixed Integer Linear Programming (MIP), a (strong) backdoor is a “small” subset of an instance's integer variables with the following property: in a branch-and-bound procedure, the instance can be solved to global optimality by branching only on the variables in the backdoor. Constructing datasets of pre-computed backdoors for widely used MIP benchmark sets or particular problem families can enable new questions about novel structural properties of a MIP, or explain why a problem that is hard in theory can be solved efficiently in practice. Existing algorithms for finding backdoors rely on sampling candidate variable subsets in various ways, an approach which has demonstrated the existence of backdoors for some instances from MIPLIB2003 and MIPLIB2010. However, these algorithms fall short of consistently succeeding at the task due to an imbalance between exploration and exploitation. We propose BaMCTS, a Monte Carlo Tree Search framework for finding backdoors to MIPs. Extensive algorithmic engineering, hybridization with traditional MIP concepts, and close integration with the CPLEX solver have enabled our method to outperform baselines on MIPLIB2017 instances, finding backdoors more frequently and more efficiently.

#15 Learning to Search in Local Branching

Authors: Defeng Liu ; Matteo Fischetti ; Andrea Lodi

Finding high-quality solutions to mixed-integer linear programming (MILP) problems is of great importance for many practical applications. In this respect, the refinement heuristic local branching (LB) was proposed to produce improving solutions and has been highly influential for the development of local search methods in MILP. The algorithm iteratively explores a sequence of solution neighborhoods defined by the so-called local branching constraint, namely, a linear inequality limiting the distance from a reference solution. For an LB algorithm, the choice of the neighborhood size is critical to performance. Although it was initialized to a conservative value in the original LB scheme, our new observation is that the "best" size is strongly dependent on the particular MILP instance. In this work, we investigate the relation between the size of the search neighborhood and the behavior of the underlying LB algorithm, and we devise a learning-based framework for guiding the neighborhood search of the LB heuristic. The framework consists of a two-phase strategy. In the first phase, a scaled regression model is trained to predict the size of the LB neighborhood at the first iteration. In the second phase, we leverage reinforcement learning and devise a reinforced neighborhood search strategy to dynamically adapt the size at subsequent iterations. We show computationally that the neighborhood size can indeed be learned, leading to improved performance, and that the overall algorithm generalizes well, both with respect to the instance size and, remarkably, across instances.
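
For reference, the local branching constraint for a 0-1 reference solution x̄ and neighborhood size k is the linear inequality below (standard form from the LB literature); it bounds the Hamming distance between x and x̄ by k:

```latex
\Delta(x,\bar{x}) \;=\; \sum_{j:\ \bar{x}_j = 0} x_j \;+\; \sum_{j:\ \bar{x}_j = 1} \left(1 - x_j\right) \;\le\; k .
```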

#16 Analysis of Pure Literal Elimination Rule for Non-uniform Random (MAX) k-SAT Problem with an Arbitrary Degree Distribution

Authors: Oleksii Omelchenko ; Andrei A. Bulatov

MAX k-SAT is one of the archetypal NP-hard problems. Its variation, the random MAX k-SAT problem, was introduced in order to understand how hard instances of the problem are to solve on average. The most common model for sampling random instances is the uniform model, which has received a large amount of attention. However, the uniform model often fails to capture important structural properties we observe in real-world instances. To address these limitations, a more general (in a certain sense) model has been proposed: the configuration model, which can produce instances with an arbitrary distribution of variable degrees and can thus simulate the biases in instances appearing in various applications. Our overall goal is to expand the theory built around the uniform model to the more general configuration model for a wide range of degree distributions. This includes locating satisfiability thresholds and analysing the performance of standard heuristics applied to instances sampled from the configuration model. In this paper we analyse the performance of the pure literal elimination rule. We provide an equation that, given an underlying degree distribution, gives the number of clauses the pure literal elimination rule satisfies w.h.p. We also show how the distribution of variable degrees changes over time as the algorithm executes.
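
As a reminder of the rule being analysed, a minimal sketch (clause representation as lists of signed integers is ours): a literal is pure if its negation occurs in no clause, and setting all pure literals to true satisfies every clause containing one of them.

```python
def pure_literal_elimination(clauses):
    # Repeatedly collect literals whose negation appears nowhere,
    # set them true, and remove the clauses they satisfy.
    satisfied = 0
    while clauses:
        occurring = {lit for clause in clauses for lit in clause}
        pure = {lit for lit in occurring if -lit not in occurring}
        if not pure:
            break
        remaining = [c for c in clauses if not pure.intersection(c)]
        satisfied += len(clauses) - len(remaining)
        clauses = remaining
    return satisfied  # number of clauses the rule satisfies
```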

#17 The SoftCumulative Constraint with Quadratic Penalty

Authors: Yanick Ouellet ; Claude-Guy Quimper

The Cumulative constraint greatly contributes to the success of constraint programming at solving scheduling problems. The SoftCumulative, a variant of the Cumulative in which overloading the resource incurs a penalty, is, however, less studied. We introduce a checker and a filtering algorithm for the SoftCumulative, both inspired by the powerful energetic reasoning rule for the Cumulative. Both algorithms can be used not only with the classic linear penalty function, but also with a quadratic penalty function, where the penalty for overloading the resource increases quadratically with the amount of overload. We show that these algorithms are more general than existing algorithms and vastly outperform a decomposition of the SoftCumulative in practice.
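
Concretely, with resource capacity C and usage load(t) at time t, the two penalty functions take the form below (notation ours, illustrative):

```latex
P_{\mathrm{lin}} \;=\; \sum_{t} \max\big(0,\ \mathrm{load}(t) - C\big),
\qquad
P_{\mathrm{quad}} \;=\; \sum_{t} \max\big(0,\ \mathrm{load}(t) - C\big)^{2}.
```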

#18 Efficient Vertex-Oriented Polytopic Projection for Web-Scale Applications

Authors: Rohan Ramanath ; S. Sathiya Keerthi ; Yao Pan ; Konstantin Salomatin ; Kinjal Basu

We consider applications involving a large set of instances of projecting points onto polytopes. We develop an intuition, guided by theoretical and empirical analysis, to show that when these instances follow certain structures, a large majority of the projections lie on vertices of the polytopes. To carry out these projections efficiently, we derive a vertex-oriented incremental algorithm to project a point onto an arbitrary polytope, as well as specific algorithms that cater to simplex projection and to polytopes where the unit box is cut by planes. Such settings are especially useful in web-scale applications such as optimal matching or allocation problems. Several such problems in internet marketplaces (e-commerce, ride-sharing, food delivery, professional services, advertising, etc.) can be formulated as linear programs (LPs) with polytope constraints that require a projection step in the overall optimization process. We show that in some very recent works, the polytopic projection is the most expensive step, and our efficient projection algorithms yield substantial performance improvements.
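
As background for the simplex special case, here is the classic sorting-based Euclidean projection onto the probability simplex (a well-known baseline, not the paper's vertex-oriented algorithm):

```python
import numpy as np

def project_to_simplex(v):
    # Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u)
    # Largest index (0-based) where the shifted coordinate stays positive.
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)      # common shift
    return np.maximum(v + theta, 0.0)

x = project_to_simplex(np.array([0.8, 0.6, -0.2]))
assert abs(x.sum() - 1.0) < 1e-12 and (x >= 0).all()
```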

#19 A Variant of Concurrent Constraint Programming on GPU

Authors: Pierre Talbot ; Frédéric G Pinel ; Pascal Bouvry

The number of cores on graphics processing units (GPUs) is reaching thousands nowadays, whereas the clock speed of processors stagnates. Unfortunately, constraint programming solvers do not yet take advantage of GPU parallelism. One reason is that constraint solvers were primarily designed within the mental frame of sequential computation. To address this issue, we take a step back and contribute a simple, intrinsically parallel, lock-free and formally correct programming language based on concurrent constraint programming. We then re-examine parallel constraint solving on GPUs within this formalism and develop Turbo, a simple constraint solver entirely programmed on GPUs. Turbo validates the correctness of our approach and compares positively to a parallel CPU-based solver.

#20 Real-Time Driver-Request Assignment in Ridesourcing

Authors: Hao Wang ; Xiaohui Bei

Online on-demand ridesourcing services have played a huge role in transforming urban transportation. A central function in most on-demand ridesourcing platforms is to dynamically assign drivers to rider requests in a way that balances request waiting times and driver pick-up distances. To deal with the online nature of this problem, existing literature either divides the time horizon into short windows and applies a static offline assignment algorithm within each window, or assumes a fully online setting that makes a decision for each request immediately upon its arrival. In this paper, we propose a more realistic model for driver-request assignment that bridges these two settings. Our model allows requests to wait after their arrival, but assumes that they may leave at any time, governed by a quitting function. Under this model, we design an efficient algorithm for assigning available drivers to requests in real time. Our algorithm is able to take estimated future driver arrivals into consideration and make strategic waiting and matching decisions that balance the waiting time and pick-up distance of the assignment. We prove that our algorithm is ex-ante optimal in the single-request setting, and we demonstrate its effectiveness in the general multi-request setting through experiments on both synthetic and real-world datasets.

#21 Encoding Multi-Valued Decision Diagram Constraints as Binary Constraint Trees

Authors: Ruiwei Wang ; Roland H.C. Yap

The ordered Multi-valued Decision Diagram (MDD) is a compact representation used to model various constraints, such as regular constraints and table constraints. It can be particularly useful for representing ad hoc, problem-specific constraints. Many algorithms have been proposed to enforce Generalized Arc Consistency (GAC) on MDD constraints. In this paper, we introduce a new compact representation called the Binary Constraint Tree (BCT). We propose tree binary encodings that transform any MDD constraint into a BCT constraint, and we present a specialized algorithm enforcing GAC on the BCT constraint resulting from an MDD constraint. Experimental results on a large set of benchmarks show that the BCT GAC algorithm can significantly outperform state-of-the-art MDD as well as table GAC algorithms.

#22 Sample Average Approximation for Stochastic Optimization with Dependent Data: Performance Guarantees and Tractability

Authors: Yafei Wang ; Bo Pan ; Wei Tu ; Peng Liu ; Bei Jiang ; Chao Gao ; Wei Lu ; Shangling Jui ; Linglong Kong

Sample average approximation (SAA), a popular method for tractably solving stochastic optimization problems, enjoys strong asymptotic performance guarantees in settings with independent training samples. However, these guarantees are not known to hold generally with dependent samples, such as in online learning with time-series data or distributed computing with Markovian training samples. In this paper, we show that SAA remains tractable when the distribution of unknown parameters is only observable through dependent instances, and that it still enjoys asymptotic consistency and finite-sample guarantees. Specifically, we provide a rigorous probabilistic error analysis to derive (1 - beta)-confidence bounds for the out-of-sample performance of SAA estimators and show that these estimators are asymptotically consistent. Then, using monotone operator theory, we study the performance of a class of stochastic first-order algorithms trained on a dependent source of data. We show that the approximation error for these algorithms is bounded and concentrates around zero, and we establish deviation bounds for the iterates when the underlying stochastic process is phi-mixing. The algorithms presented can be used to handle numerically inconvenient loss functions, such as the sum of a smooth and a non-smooth function, or of non-smooth functions with constraints. To illustrate the usefulness of our results, we present stochastic versions of popular algorithms, such as stochastic proximal gradient descent (S-PGD) and stochastic relaxed Peaceman-Rachford splitting (S-rPRS), together with numerical experiments.
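
For reference, SAA replaces the population problem by its empirical counterpart over N samples (standard formulation; notation ours):

```latex
\min_{x \in X} \; f(x) = \mathbb{E}_{\xi \sim P}\big[F(x,\xi)\big]
\quad \longrightarrow \quad
\min_{x \in X} \; \hat{f}_N(x) = \frac{1}{N} \sum_{i=1}^{N} F(x, \xi_i).
```

The paper's setting replaces the usual i.i.d. assumption on the samples ξ_1, …, ξ_N with dependence, e.g., a phi-mixing process.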

#23 A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Authors: Honghao Wei ; Xin Liu ; Lei Ying

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). For a sufficiently large learning horizon K, the proposed algorithm achieves sublinear regret and zero constraint violation. The bounds depend on the number of states S, the number of actions A, and two constants that are independent of the learning horizon K.

#24 TextHoaxer: Budgeted Hard-Label Adversarial Attacks on Text

Authors: Muchao Ye ; Chenglin Miao ; Ting Wang ; Fenglong Ma

This paper focuses on a new and challenging setting for hard-label adversarial attacks on text data that takes budget information into account. Although existing approaches can successfully generate adversarial examples in the hard-label setting, they rely on the idealized assumption that the victim model does not restrict the number of queries. However, in real-world applications the query budget is usually tight or limited. Moreover, existing hard-label adversarial attack techniques use genetic algorithms to optimize discrete text data by maintaining a number of adversarial candidates during optimization, which can lead to low-quality adversarial examples in the tight-budget setting. To solve this problem, we propose a new method named TextHoaxer that formulates the budgeted hard-label adversarial attack task on text data as a gradient-based optimization problem over a perturbation matrix in the continuous word embedding space. Compared with genetic-algorithm-based optimization, our solution uses only a single initialized adversarial example as the adversarial candidate, which significantly reduces the number of queries. The optimization is guided by a new objective function consisting of three terms: a semantic similarity term, a pair-wise perturbation constraint, and a sparsity constraint. The semantic similarity term and the pair-wise perturbation constraint ensure high semantic similarity of adversarial examples at both the text level and the individual word level, while the sparsity constraint explicitly restricts the number of perturbed words, which also helps enhance the quality of the generated text. We conduct extensive experiments on eight text datasets against three representative natural language models, and the results show that TextHoaxer can generate high-quality adversarial examples with higher semantic similarity and lower perturbation rates under the tight-budget setting.

#25 Two Compacted Models for Efficient Model-Based Diagnosis

Authors: Huisi Zhou ; Dantong Ouyang ; Xiangfu Zhao ; Liming Zhang

Model-based diagnosis (MBD) with multiple observations is complicated and difficult to manage. In this paper, we propose two new diagnosis models, namely the Compacted Model with Multiple Observations (CMMO) and the Dominated-based Compacted Model with Multiple Observations (D-CMMO), to address the considerable amount of time needed when multiple observations are given and more than one fault is injected. Three ideas are presented in this paper. First, we propose to encode MBD with each observation as a subsystem and to share as many system variables as possible in order to compress the size of the encoded clauses. Second, we utilize the notion of gate dominance in the CMMO approach to compute the Top-Level Diagnosis with Compacted Model (CM-TLD), reducing the solution space. Finally, we explore the performance of our model using three fault models. Experimental results on the ISCAS-85 benchmarks show that CMMO and D-CMMO perform better than state-of-the-art algorithms.